Content-based Image Retrieval

Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. For a scientific overview of the CBIR field, see the survey "Content-based Multimedia Information Retrieval: State of the Art and Challenges", Michael Lew, et al., ACM Transactions on Multimedia Computing, Communications, and Applications, pp. 1–19, 2006. Content-based image retrieval is opposed to traditional concept-based approaches (see Concept-based image indexing).

"Content-based" means that the search analyzes the contents of the image rather than the metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because searches that rely purely on metadata are dependent on annotation quality and completeness. Having humans manually annotate images by entering keywords or metadata in a large database can be time-consuming and may not capture the keywords desired to describe the image. The evaluation of the effectiveness of keyword image search is subjective and has not been well defined. In the same regard, CBIR systems have similar challenges in defining success. Keywords also "limit the scope of queries to the set of predetermined criteria" and, "having been set up", are less reliable than using the content itself.


History

The term "content-based image retrieval" seems to have originated in 1992, when it was used by Japanese Electrotechnical Laboratory engineer Toshikazu Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present. Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features. The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision.


QBIC - Query By Image Content

The earliest commercial CBIR system was developed by IBM and was called QBIC (Query By Image Content). Recent network- and graph-based approaches have presented a simple and attractive alternative to existing methods. While the storing of multiple images as part of a single entity preceded the term BLOB (Binary Large OBject), the ability to fully search by content, rather than by description, had to await IBM's QBIC.


Technical progress

The interest in CBIR has grown because of the limitations inherent in metadata-based systems, as well as the large range of possible uses for efficient image retrieval. Textual information about images can be easily searched using existing technology, but this requires humans to manually describe each image in the database. This can be impractical for very large databases or for images that are generated automatically, e.g. those from surveillance cameras. It is also possible to miss images that use different synonyms in their descriptions. Systems based on categorizing images in semantic classes like "cat" as a subclass of "animal" can avoid the miscategorization problem, but will require more effort by a user to find images that might be "cats", but are only classified as an "animal". Many standards have been developed to categorize images, but all still face scaling and miscategorization issues.

Initial CBIR systems were developed to search databases based on image color, texture, and shape properties. After these systems were developed, the need for user-friendly interfaces became apparent. Therefore, efforts in the CBIR field started to include human-centered design that tried to meet the needs of the user performing the search. This typically means inclusion of: query methods that may allow descriptive semantics, queries that may involve user feedback, systems that may include machine learning, and systems that may understand user satisfaction levels.


Techniques

Many CBIR systems have been developed, but the problem of retrieving images on the basis of their pixel content remains largely unsolved. Different query techniques and implementations of CBIR make use of different types of user queries.


Query By Example

QBE (Query By Example) is a query technique that involves providing the CBIR system with an example image that it will then base its search upon. The underlying search algorithms may vary depending on the application, but result images should all share common elements with the provided example. Options for providing example images to the system include:
* A preexisting image may be supplied by the user or chosen from a random set.
* The user draws a rough approximation of the image they are looking for, for example with blobs of color or general shapes.
This query technique removes the difficulties that can arise when trying to describe images with words.


Semantic retrieval

Semantic retrieval starts with a user making a request like "find pictures of Abraham Lincoln". This type of open-ended task is very difficult for computers to perform: Lincoln may not always be facing the camera or in the same pose. Many CBIR systems therefore make use of lower-level features like texture, color, and shape. These features are used either in combination with interfaces that allow easier input of the criteria or with databases that have already been trained to match features (such as faces, fingerprints, or shape matching). In general, however, image retrieval requires human feedback in order to identify higher-level concepts.


Relevance feedback (human interaction)

Combining the available CBIR search techniques with the wide range of potential users and their intent can be a difficult task. One aspect of making CBIR successful relies entirely on the ability to understand the user's intent. CBIR systems can make use of relevance feedback, where the user progressively refines the search results by marking images in the results as "relevant", "not relevant", or "neutral" to the search query, then repeating the search with the new information. Examples of this type of interface have been developed.
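
The feedback loop described above can be sketched in a few lines. The source does not name a particular update rule, so this sketch swaps in a Rocchio-style update borrowed from text retrieval as an illustration only: it assumes each image is already described by a numeric feature vector, and the names `refine_query` and `rank` as well as the weights `alpha`, `beta`, and `gamma` are hypothetical. Images marked "neutral" are simply left out of both lists.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def refine_query(query, relevant, not_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (an assumed rule, not taken from the source):
    pull the query toward images marked relevant and push it away from
    images marked not relevant."""
    updated = [alpha * q for q in query]
    if relevant:
        centre = mean_vector(relevant)
        updated = [u + beta * c for u, c in zip(updated, centre)]
    if not_relevant:
        centre = mean_vector(not_relevant)
        updated = [u - gamma * c for u, c in zip(updated, centre)]
    return updated

def rank(query, database):
    """Indices of database vectors, nearest to the query first."""
    return sorted(range(len(database)), key=lambda i: math.dist(query, database[i]))
```

A search session would call `rank`, collect the user's marks on the returned images, call `refine_query`, and then call `rank` again with the updated query vector.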


Iterative/machine learning

Machine learning and the application of iterative techniques are becoming more common in CBIR.


Other query methods

Other query methods include browsing for example images, navigating customized/hierarchical categories, querying by image region (rather than the entire image), querying by multiple example images, querying by visual sketch, querying by direct specification of image features, and multimodal queries (e.g. combining touch, voice, etc.).


Content comparison using image distance measures

The most common method for comparing two images in content-based image retrieval (typically an example image and an image from the database) is to use an image distance measure. An image distance measure compares two images along various dimensions such as color, texture, and shape. A distance of 0 signifies an exact match with the query with respect to the dimensions that were considered; a value greater than 0 indicates a lesser degree of similarity, with larger values meaning less similar images. Search results can then be sorted by their distance to the queried image. Many measures of image distance (similarity models) have been developed.
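
As a minimal sketch of sorting results by distance, the following assumes that every image has already been reduced to one feature vector per dimension (for example "color" and "texture") and that the overall distance is a weighted sum of per-dimension Euclidean distances. The function names, the dictionary layout, and the weighting scheme are all assumptions made for illustration; real systems use many different measures.

```python
import math

def image_distance(features_a, features_b, weights):
    """Weighted sum of Euclidean distances, one term per feature dimension.
    Returns 0.0 only when the images match in every weighted dimension."""
    return sum(
        weight * math.dist(features_a[name], features_b[name])
        for name, weight in weights.items()
    )

def search(query, database, weights):
    """Return (distance, image_id) pairs, best match first."""
    scored = [
        (image_distance(query, features, weights), image_id)
        for image_id, features in database.items()
    ]
    return sorted(scored)
```

Setting a weight to zero removes that dimension from the comparison, which is one simple way to let a user search by color alone or by texture alone.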


Color

Computing distance measures based on color similarity is achieved by computing a color histogram for each image that identifies the proportion of pixels within an image holding specific values. Examining images based on the colors they contain is one of the most widely used techniques because it can be completed without regard to image size or orientation. However, research has also attempted to segment color proportion by region and by spatial relationship among several color regions.
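
A rough sketch of this idea follows, assuming an image is given as a flat list of `(r, g, b)` tuples with 8-bit channels. The bin count and the function names are illustrative choices, and histogram intersection is used here as one possible way to turn two histograms into a distance.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Proportion of pixels falling in each quantized RGB bin."""
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    # Normalizing by the pixel count makes the result independent of image size.
    return [count / len(pixels) for count in hist]

def histogram_distance(hist_a, hist_b):
    """1 minus the histogram intersection: 0.0 for identical color
    proportions, 1.0 when the two images share no color bin."""
    return 1.0 - sum(min(a, b) for a, b in zip(hist_a, hist_b))
```

Because the histogram stores proportions and ignores pixel positions, resizing an image or reordering its pixels (for instance by rotating it) leaves the distance unchanged, which is the property the paragraph above relies on.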


Texture

Texture measures look for visual patterns in images and how they are spatially defined. Textures are represented by texels, which are then placed into a number of sets, depending on how many textures are detected in the image. These sets define not only the texture, but also where in the image the texture is located. Texture is a difficult concept to represent. The identification of specific textures in an image is achieved primarily by modeling texture as a two-dimensional gray level variation. The relative brightness of pairs of pixels is computed such that degree of contrast, regularity, coarseness, and directionality may be estimated. The problem is in identifying patterns of co-pixel variation and associating them with particular classes of textures such as "silky" or "rough". Other methods of classifying textures include:
* Co-occurrence matrix
* Laws texture energy
* Wavelet transform
* Orthogonal transforms (Discrete Tchebichef moments)
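
To make the pairwise-brightness idea concrete, here is a small sketch of a gray-level co-occurrence matrix and the contrast statistic derived from it. It assumes the image is a list of rows of integer gray levels in `0..levels-1`, already quantized, and that at least one pixel pair exists for the chosen offset; the function names and the default offset of one pixel to the right are illustrative.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized counts of gray-level pairs (a, b), where b is the pixel
    found at offset (dx, dy) from a pixel with level a."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    pairs = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
                pairs += 1
    # Assumes pairs > 0, i.e. the image is larger than the offset.
    return [[c / pairs for c in row] for row in counts]

def contrast(glcm):
    """Sum of p(i, j) * (i - j)^2: 0.0 for a flat region, larger for
    textures whose neighbouring pixels differ strongly in brightness."""
    return sum(
        p * (i - j) ** 2
        for i, row in enumerate(glcm)
        for j, p in enumerate(row)
    )
```

Computing the matrix for several offsets (horizontal, vertical, diagonal) gives a rough handle on directionality, since a striped texture yields high contrast across the stripes and low contrast along them.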


Shape

Shape does not refer to the shape of an image but to the shape of a particular region that is being sought out. Shapes will often be determined by first applying segmentation or edge detection to an image. Other methods use shape filters to identify given shapes of an image. Shape descriptors may also need to be invariant to translation, rotation, and scale. Some shape descriptors include:
* Fourier transform
* Moment invariant
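
As an illustration of a moment invariant, the sketch below computes the first of Hu's seven moment invariants for a segmented region. It assumes the region is given as a binary mask (a list of rows of 0/1 values) with at least one set pixel; the function name is hypothetical, and only the first invariant is shown.

```python
def first_hu_invariant(mask):
    """First Hu moment invariant, eta20 + eta02, of a binary region.
    Central moments remove the dependence on position; dividing by the
    squared area normalizes for scale."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(points)  # zeroth moment of a binary region
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    return (mu20 + mu02) / area ** 2
```

The value is unchanged when the region is shifted within the mask or rotated by a multiple of 90 degrees. For arbitrary rotations and for rescaling, the invariance holds only approximately on a pixel grid, because the region has to be resampled.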


Vulnerabilities, attacks and defenses

Like other tasks in computer vision such as recognition and detection, recent neural-network-based retrieval algorithms are susceptible to adversarial attacks, both candidate attacks and query attacks. It has been shown that the retrieved ranking can be dramatically altered with only small perturbations that are imperceptible to human beings. In addition, model-agnostic transferable adversarial examples are also possible, which enables black-box adversarial attacks on deep ranking systems without requiring access to their underlying implementations. Conversely, the resistance to such attacks can be improved via adversarial defenses such as the Madry defense.


Image retrieval evaluation

Measures of image retrieval can be defined in terms of precision and recall. However, there are other methods being considered.
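
For reference, precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that were retrieved. A minimal sketch, assuming images are identified by hashable IDs and that a ground-truth set of relevant images is available for the query (the function name is illustrative):

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.
    retrieved: IDs returned by the system; relevant: ground-truth IDs."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

For example, if a query returns four images of which two are relevant, and three relevant images exist in the database, precision is 2/4 and recall is 2/3.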


Image retrieval in a CBIR system by several techniques simultaneously

An image can be retrieved in a CBIR system by applying several techniques simultaneously, such as integrating pixel cluster indexing, histogram intersection, and discrete wavelet transform methods.


Applications

Potential uses for CBIR include:
* Architectural and engineering design
* Art collections
* Crime prevention
* Geographical information and remote sensing systems
* Intellectual property
* Medical diagnosis
* Military
* Photograph archives
* Retail catalogs
* Nudity-detection filters
* Face Finding
* Textiles Industry

Commercial systems that have been developed include:
* IBM's QBIC
* Virage's VIR Image Engine
* Excalibur's Image RetrievalWare
* VisualSEEk and WebSEEk
* Netra
* MARS
* Vhoto
* Pixolution

Experimental systems include:
* MIT's Photobook
* Columbia University's WebSEEk
* Carnegie-Mellon University's Informedia
* iSearch - PICT


See also

* Document classification
* GazoPa
* Image retrieval
* List of CBIR engines
* Macroglossa Visual Search
* MPEG-7
* Multimedia information retrieval
* Multiple-instance learning
* Nearest neighbor search
* Learning to rank


Further reading


Relevant research papers

* Query by Image and Video Content: The QBIC System (Flickner, 1995)
* Finding Naked People (Fleck et al., 1996)
* Virage Video Engine (Hampapur, 1997)
* Library-based Coding: a Representation for Efficient Video Compression and Retrieval (Vasconcelos & Lippman, 1997)
* System for Screening Objectionable Images (Wang et al., 1998)
* Content-based Image Retrieval (JISC Technology Applications Programme Report 39) (Eakins & Graham, 1999)
* Windsurf: Region-Based Image Retrieval Using Wavelets (Ardizzoni, Bartolini, and Patella, 1999)
* A Probabilistic Architecture for Content-based Image Retrieval (Vasconcelos & Lippman, 2000)
* A Unifying View of Image Similarity (Vasconcelos & Lippman, 2000)
* Next Generation Web Searches for Visual Content (Lew, 2000)
* Image Indexing with Mixture Hierarchies (Vasconcelos, 2001)
* SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture Libraries (Wang, Li, and Wiederhold, 2001)
* A Conceptual Approach to Web Image Retrieval (Popescu and Grefenstette, 2008)
* FACERET: An Interactive Face Retrieval System Based on Self-Organizing Maps (Ruiz-del-Solar et al., 2002)
* Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach (Li and Wang, 2003)
* Video google: A text retrieval approach to object matching in videos (Sivic & Zisserman, 2003)
* Minimum Probability of Error Image Retrieval (Vasconcelos, 2004)
* On the Efficient Evaluation of Probabilistic Similarity Functions for Image Retrieval (Vasconcelos, 2004)
* Extending image retrieval systems with a thesaurus for shapes (Hove, 2004)
* Names and Faces in the News (Berg et al., 2004)
* Cortina: a system for large-scale, content-based web image retrieval (Quack et al., 2004)
* A new perspective on Visual Information Retrieval (Eidenberger, 2004)
* Language-based Querying of Image Collections on the basis of an Extensible Ontology (Town and Sinclair, 2004)
* The PIBE Personalizable Image Browsing Engine (Bartolini, Ciaccia, and Patella, 2004)
* (Jaffre, 2005)
* Automatic Face Recognition for Film Character Retrieval in Feature-Length Films (Arandjelovic & Zisserman, 2005)
* Meaningful Image Spaces (Rouw, 2005)
* Content-based Multimedia Information Retrieval: State of the Art and Challenges (Lew et al., 2006)
* Adaptively Browsing Image Databases with PIBE (Bartolini, Ciaccia, and Patella, 2006)
* Algorithm on which Retrievr (Flickr search) and imgSeek are based (Jacobs, Finkelstein, Salesin)
* Imagination: Exploiting Link Analysis for Accurate Image Annotation (Bartolini and Ciaccia, 2007)
* Evaluating Use of Interfaces for Visual Query Specification (Hove, 2007)
* From Pixels to Semantic Spaces: Advances in Content-Based Image Retrieval (Vasconcelos, 2007)
* Content-based Image Retrieval by Indexing Random Subwindows with Randomized Trees (Maree et al., 2007)
* Image Retrieval: Ideas, Influences, and Trends of the New Age (Datta et al., 2008)
* Real-Time Computerized Annotation of Pictures (Li and Wang, 2008)
* Query Processing Issues in Region-based Image Databases (Bartolini, Ciaccia, and Patella, 2010)
* Shiatsu: Semantic-based Hierarchical Automatic Tagging of Videos by Segmentation Using Cuts (Bartolini, Patella, and Romani, 2010)
* Efficient and Effective Similarity-based Video Retrieval (Bartolini and Romani, 2010)
* Multi-dimensional Keyword-based Image Annotation and Search (Bartolini and Ciaccia, 2010)
* The Windsurf Library for the Efficient Retrieval of Multimedia Hierarchical Data (Bartolini, Patella, and Stromei, 2011)
* Pl@ntNet: Interactive plant identification based on social image data (Joly, Alexis et al.)
* Content based Image Retrieval (Tyagi Vipin, 2017)
* Superimage: Packing Semantic-Relevant Images for Indexing and Retrieval (Luo, Zhang, Huang, Gao, Tian, 2014)
* Indexing and searching 100M images with Map-Reduce (Moise, Shestakov, Gudmundsson, and Amsaleg, 2013)


External links

* IJMIR (many CBIR-related articles)
* Search by Drawing

